
REMARKS 

Claims 1-41 are pending in this application. By this Amendment, claims 4, 5, 6, 
8, 9, 13, 14, 15, 17, 18, 20, 22, 23, 24, 25, 26, 27, 30, 32, 34, 35, 36, 38, 40 and 41 are 
amended to correct the multiple dependency thereof and to place this application into 
better condition for examination. No new matter is added. 

Process and Apparatus for In Silico Two-Hybrid Analysis 



Claims 



1. Process for the determination of interacting biomolecules characterized in that similar 
patterns of variation between two or more positions of at least two biomolecules are 
used. 

2. Process for the determination of interacting biomolecules, 
characterized in that 

a) a first group is provided comprising sequences representing homologous 
biomolecules, 

b) at least one second group is provided comprising sequences repre- 
senting homologous biomolecules, 

c) group correlation values between the sequences of the first group and 
the sequences of at least one second group are determined, and 

d) the probability of the interaction of the sequence represented bio- 
molecules is determined on the basis of the group correlation values. 

3. Process according to claim 2, characterized in that the probability of the interaction is 
calculated as predicted interaction value. 

4. Process according to claim 2 or 3, characterized in that the interacting biomolecules 
are those with a positive predicted interaction value. 

5. Process according to any of claims 2 to 4, characterized in that any of the second 
group(s) is converted into the first group and the first group is converted into a second 
group and group correlation values between the sequences of this new first group and 
the sequences of any of the second group(s) which also comprises the former first 
group, are determined. 

6. Process according to any of claims 2 to 5, characterized in that 

site correlation values within each of the sequences within the first group and/or site 
correlation values within each of the sequences within the second group(s) are deter- 
mined and said site correlation values are used for the calculation of the probability of 
interaction and/or for the calculation of the predicted interaction value of the sequence 
represented biomolecules. 

7. Process according to claim 6, characterized in that the site correlation values are correlation 
values for substitutions within the sequences. 

8. Process according to any of claims 2 to 7, characterized in that 

each sequence of each of said groups is fused to each other to form fused sequences 
comprising at least one sequence of the first group and at least one sequence of any 
second group(s), 

the correlation values within these fused sequences are determined, and 

the correlation values are used as group correlation values for determining the pre- 
dicted interaction value and/or the probability of interaction. 

9. Process according to any of claims 2 to 8, characterized in that 
correlation values are determined by 

creating a position specific matrix containing the distances between pairs of sequences 
at that position whereby the distances are calculated by applying a standard distances 
matrix, 



creating a combined matrix for two positions by calculating the covariation coefficient 
between equivalent positions of their position specific matrices, and 

determining the correlation value for a pair of positions by averaging the correlation 
values of the combined matrix. 

10. Process according to claim 9, characterized in that the standard distances matrix is the 
scoring matrix by McLachlan. 
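
Editorial illustration, not part of the claim text: one possible reading of the three steps of claim 9, given as a minimal Python sketch. It assumes the sequences are already aligned and of equal length. The function names and `toy_score` are hypothetical; `toy_score` is only a stand-in for a standard distances matrix such as the McLachlan matrix of claim 10, whose values are not reproduced here. Treating the "covariation coefficient" as the standardized product of deviations, so that its average is a Pearson correlation, is likewise an assumption.

```python
from itertools import combinations
from statistics import mean, pstdev


def toy_score(a, b):
    """Placeholder for a standard distances matrix: 1.0 if residues match, else 0.0."""
    return 1.0 if a == b else 0.0


def position_matrix(alignment, pos, score=toy_score):
    """Step 1: scores between every pair of sequences at one position (flattened)."""
    return [score(s[pos], t[pos]) for s, t in combinations(alignment, 2)]


def correlation_value(alignment, i, j, score=toy_score):
    """Steps 2 and 3: combine the matrices of positions i and j, then average."""
    si = position_matrix(alignment, i, score)
    sj = position_matrix(alignment, j, score)
    sd_i, sd_j = pstdev(si), pstdev(sj)
    if sd_i == 0 or sd_j == 0:
        # Assumption: a fully conserved position carries no covariation signal.
        return 0.0
    mi, mj = mean(si), mean(sj)
    # Combined matrix: standardized product for each equivalent pair of entries.
    combined = [(a - mi) * (b - mj) / (sd_i * sd_j) for a, b in zip(si, sj)]
    return mean(combined)
```

Under this reading, two positions whose residues change together across the aligned sequences receive a value close to 1, and a fully conserved position receives 0.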

11. Method for the determination of interacting biomolecules which comprises processing 
data of at least a first set of data and at least a second set of data to output data 

wherein each of the sets of data comprises independently and individually at least one 
or more elements, 

wherein each of the elements represents the sequence of a biomolecule, 

wherein the elements of a single set of data represent a group of homologous bio- 
molecules, 

wherein the output data comprises at least one pair of elements with one part of the 
pair of elements comprising at least one element from the first set of data and the other 
part of the pair of elements comprising at least one element from the second set of 
data, 

characterised in that 

a group correlation values data set is created comprising group correlation val- 
ues which are determined between the sequences of the first set of data and at least the 
second set of data; 

an interaction probability data set is created by retrieving group correlation 
values from the group correlation values data set and determining the probability of 
interaction of the biomolecules based on the group correlation values; and 

at least some of the elements from the first and at least the second set of data which 
have been used to create the group correlation values and the interaction probability 
therefrom form the output data. 



12. Method according to claim 11, characterized in that the probability of the interaction is 
calculated as predicted interaction value. 

13. Method according to claim 11 or 12, characterized in that the elements the predicted 
interaction value of which is positive, are interacting biomolecules. 

14. Method according to any of claims 11 to 13, characterized in that 

any of the second set(s) of data is converted into the first set of data and the first set of 
data is converted into a second set of data, and 

group correlation values are determined between the sequences of this new first set of 
data and the sequences of any of the second set(s). 

15. Method according to any of claims 11 to 14, characterized in that 

site correlation values within each of the sequences within the first set of data and/or 
site correlation values within each of the sequences within the second set(s) of data are 
determined, and 

said site correlation values form a set-specific site correlation value data set. 

16. Method according to claim 15, characterized in that the set-specific site correlation 
value data set is used to calculate the probability of interaction of and/or to calculate 
the predicted interaction value of the sequence represented biomolecules. 

17. Method according to claim 15 or 16, characterized in that the site correlation values 
are correlation values for substitutions within the sequences. 

18. Method according to any of claims 11 to 17, characterized in that 

a fused element set of data is generated by combining each element of the first set of 
data individually with each element of any of the second set(s) of data, and 

attributing each fused element individually to the fused element set of data. 

19. Method according to claim 18, characterized in that 

the correlation values are determined within the various positions of a single element 
of the fused element set of data, and 

the correlation values are used as group correlation values for determining the prob- 
ability of the interaction of and/or predicted interaction value(s) of the biomolecules. 
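
Editorial illustration, not part of the claim text: a minimal Python sketch of the fusing step of claims 18 and 19, assuming each element is a plain sequence string and that fusing means simple concatenation. The function names are hypothetical. The second helper lists the position pairs of a fused element that span both original sequences, which is where correlation values between the two sets could be read off.

```python
from itertools import product


def fuse_sets(first_set, second_set):
    """Combine every element of the first set with every element of the second set."""
    return [a + b for a, b in product(first_set, second_set)]


def cross_position_pairs(len_first, len_second):
    """Position pairs (i, j) with i in the first part and j in the second part."""
    return [(i, len_first + j) for i in range(len_first) for j in range(len_second)]
```

For example, fusing a first set of two sequences with a second set of two sequences yields four fused elements, one per combination.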

20. Method according to any of claims 11 to 19, characterized in that the correlation 
values are determined by 

creating a position specific matrix containing the distances between pairs of sequences 
at that position whereby the distances are calculated by applying a standard distances 
matrix, 

creating a combined matrix for two positions by calculating the covariation coefficient 
between equivalent positions of their position specific matrices, and 

determining the correlation value for a pair of positions by averaging the correlation 
values of the combined matrix. 

21. Method according to claim 20, characterized in that the standard distances matrix is 
the scoring matrix by McLachlan. 

22. Method according to any of claims 11 to 21, characterized in that the first set of data 
and/or the second set(s) of data are retrieved from a medium which is selected 
from the group comprising databanks, linked databanks, textual data and sets of data 
generated by an analytical instrument. 

23. Method according to any of claims 11 to 22, characterized in that the set(s) of data 
comprise aligned sequences. 

24. Method according to any of claims 11 to 23, characterized in that the output data are 
output control characters for a target medium. 

25. Method or process according to any of claims 2 to 24, characterized in that the sequences 
of the first group or second group(s) or first set of data or second set(s) of data 
are selected from the group comprising DNA sequences, RNA sequences and amino 
acid sequences. 

26. Method or process according to any of claims 2 to 25, characterized in that the number 
of sequences comprised in any of the groups or any of the sets of data is at least , preferably 
at least 11. 

27. Method or process according to any of claims 2 to 26, characterized in that the sequences 
are homologous sequences. 

28. Method or process according to claim 27, characterized in that the homologous se- 
quences stem from different origins. 

29. Method or process according to claim 27, characterized in that the homologous se- 
quences in the first set of data and in the second set of data stem from the same origin 
and/or the homologous sequence in the first group and in the second group stem from 
the same origin. 



30. Method or process according to any of claims 27 to 29, characterized in that the 
homologous sequences are homologous genes. 



31. Method or process according to claim 30, characterized in that the homologous genes 
are orthologs. 

32. Use of the method according to any of claims 11 to 31 for the simulation of biomolecule 
interaction. 

33. Use according to claim 32 wherein the interacting biomolecules are those with a posi- 
tive predicted interaction value determined by a process or method according to any of 
the preceding claims. 

34. Pairs of interacting biomolecules determined according to a method or process 
according to any of the claims 2 to 31. 

35. Data structure readable by a computer, said data structure being generated by a process 
or a method according to any of claims 2 to 31. 

36. Computer readable medium for embodying or storing therein data readable by a com- 
puter, said medium comprising one or more of the following: 

a data structure generated by executing a process or a method according to any 
of claims 2 to 31; 

Computer program code means which is adapted to cause a computer to execute 
a process or method according to any one of claims 2 to 31. 

37. Computer program product comprising the computer readable medium according to 
claim 36. 

38. Database containing information on interacting sequence pairs generated by applying 
the process or method according to any of the claims 2 to 31. 

39. Database according to claim 38, wherein the database is an organism/species specific 
database. 

40. Computer system comprising an execution environment for running the process or 
method according to any of the claims 2 to 31. 

41. Device for simulating the interaction of biomolecules represented by their sequences 
which comprises 

a loading device for making available the sets of data according to any of the claims 
11 to 31, 

a processing device for performing the method according to any of the claims 11 to 31, 

an output device for receiving the output data generated by the processing device. 



